Abstract

In this paper the Bayesian and the least squares methods of quantum state
tomography are compared for a single qubit. The quality of the estimates is
compared by computer simulation when the true state is either mixed or pure.
The fidelity and the Hilbert-Schmidt distance are used to quantify the error.
It was found that in the regime of low measurement number the Bayesian
method outperforms the least squares estimation. Both methods are quite
sensitive to the degree of mixedness of the state to be estimated, that is, their
performance can be quite bad near pure states.

1 Introduction

The aim of quantum state estimation is to decide the actual state of a quantum system 
by measurements. Since the outcome of a measurement is stochastic, several measure- 
ments are to be done and statistical arguments lead to the reconstruction of the state. 
Due to some similarities with X-ray tomography, the state reconstruction is often called
quantum tomography. More precisely, in physics-related books, journals and papers,
tomography refers to both the state and parameter estimation of quantum dynamical
systems: the term state tomography is used for the first, and process tomography
for the second case [?]. The engineering literature also contains papers
related to state and parameter estimation of quantum systems, but there parameter
estimation is termed identification [13] and state estimation is termed state
filtering [?].

In this paper the estimation of the state of a qubit is discussed. This is the sim- 
plest possible case of quantum state estimation where no dynamics is assumed and 
the measurements are performed on identical copies of the qubit. Therefore, the state 
estimation problem reduces to a static parameter estimation problem, where the pa- 
rameters to be estimated are the parameters of the density matrix of the qubit. 

The methods of classical statistical estimation are used to develop state estimation
of quantum systems in the first group of papers [?]. This approach suffers from
the fact that the state estimation is usually based on a few types of measurement
(observables) that are incompatible, thus there is no joint probability density function
of the measurement results in the classical sense [?].

The most common way of statistical state estimation is the maximum-likelihood
(ML) method that leads to a convex optimization problem in the qubit case (see below).
Convex optimization methods are used in other approaches as well, see [?]. Here
one can respect the constraints imposed on the components of the state but there is
no information on the probability distribution of the estimate.

The efficiency of the ML estimate, its asymptotic properties and the Cramer-Rao
bound can be used to derive consequences on the asymptotic distribution of an estimate
and on its variance. This approach has been used for optimal experiment design in [12].
A lower bound on the estimation error for qubit state estimation is derived in [7].

It is natural to require that any state estimation scheme should be unbiased and
should converge in some stochastic sense to the true value if the number of samples
(measurements done) tends to infinity. The basis of the comparison is then a suitably
chosen measure of fit (for example averaged fidelities with respect to the true density
matrix, or the variance of the estimate). The fidelity and the Bures metric defined
therefrom were used to derive optimal estimators of the qubit state in [?]. Fidelity has
also been used to evaluate the performance of an estimation scheme for the so-called
"purity" of a qubit (i.e. the length of its Bloch vector) in the context of Bayesian state
estimation.

Large deviations can also be used to analyze the performance of state estimation
schemes [11] when the qubit is in a mixed state. An optimal estimation scheme is also
proposed based on covariant observables.

The aim of this paper is to investigate the properties of two state estimation methods,
the Bayesian state estimation as a statistical method and the least squares (LS)
method as an optimization-based method, by using simulation experiments. The simplest
possible quantum system, a single qubit (a quantum two level system), is considered,
where we could compute some of the estimates analytically.

2 Preliminaries about two level systems



The general state of a two level quantum system is described by a density operator $\rho$,
which is a positive operator on the Hilbert space $\mathbb{C}^2$, normalized to
$\mathrm{Tr}\,\rho = 1$. On the one hand, $\rho$ is represented in the form of a
$2 \times 2$ matrix, and on the other hand by the so-called Bloch vector
$s = [s_1, s_2, s_3]^T$. With use of the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

the correspondence between the density operator $\rho$ and the Bloch vector $s$ is given by
the expansion

$$\rho = \frac{1}{2}\,(I + s_1\sigma_1 + s_2\sigma_2 + s_3\sigma_3),$$

where the constraint

$$\|s\| = \sqrt{s_1^2 + s_2^2 + s_3^2} \le 1 \qquad (1)$$

is satisfied. The correspondence between $\rho$ and $s$ is affine. Thus the state space of a
spin system is represented by the three dimensional unit ball, called the Bloch ball.
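
The correspondence between a Bloch vector and its density matrix can be illustrated with a short Python sketch. It is only an illustration and not part of the simulator used in the paper; the function names `bloch_to_density` and `density_to_bloch` are hypothetical, and the standard Pauli matrices are assumed.

```python
import math

def bloch_to_density(s):
    """Return rho = (I + s1*sigma1 + s2*sigma2 + s3*sigma3)/2 as a 2x2 nested list."""
    s1, s2, s3 = s
    if math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) > 1.0 + 1e-12:
        raise ValueError("Bloch vector must satisfy ||s|| <= 1")
    return [[0.5 * (1 + s3), 0.5 * (s1 - 1j * s2)],
            [0.5 * (s1 + 1j * s2), 0.5 * (1 - s3)]]

def density_to_bloch(rho):
    """Recover s_i = Tr(rho sigma_i) from a 2x2 density matrix."""
    s1 = 2 * rho[1][0].real            # Tr(rho sigma1) = 2 Re rho_10
    s2 = 2 * rho[1][0].imag            # Tr(rho sigma2) = 2 Im rho_10
    s3 = (rho[0][0] - rho[1][1]).real  # Tr(rho sigma3) = rho_00 - rho_11
    return [s1, s2, s3]

rho = bloch_to_density([0.3, -0.4, 0.3])
back = density_to_bloch(rho)
```

The round trip returns the original Bloch vector, and the trace of `rho` equals 1 for any admissible input.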

Observables, i.e. physical quantities to be measured, are represented by self-adjoint
operators acting on the underlying Hilbert space [?]. A self-adjoint operator $A$ has a
spectral decomposition $A = \sum_i \lambda_i P_i$. The different eigenvalues $\lambda_i$ of
the operator $A$ correspond to the possible outcomes of the measurement of the associated
observable, and the $i$th outcome occurs with probability
$\mathrm{Prob}(\lambda_i) = \mathrm{Tr}\,\rho P_i$, where $P_i$ is the
projection onto the subspace of the corresponding eigenvectors. Consequently, the
expectation value of the measurement is

$$\langle A \rangle_\rho := \sum_i \lambda_i\,\mathrm{Prob}(\lambda_i) = \mathrm{Tr}\,\rho A .$$



3 Measurements on qubits 

For the state estimation, we will consider $3n$ identical copies of qubits in the state
$\rho$. On each copy in this ensemble, we perform a measurement of one of the Pauli spin
matrices $\{\sigma_1, \sigma_2, \sigma_3\}$, each of them $n$ times. The possible outcomes
for each of these single measurements, i.e. the eigenvalues of the $\sigma_i$, are $\pm 1$
and the corresponding spectral projections are given by

$$P_i^{\pm} = \frac{1}{2}\,(I \pm \sigma_i). \qquad (2)$$

For the sake of definiteness, we assume that first $\sigma_1$ is measured $n$ times, then
$\sigma_2$ and then $\sigma_3$. The data set of the outcomes of this measurement scheme
consists of three strings of length $n$ with entries $\pm 1$:

$$D_i^n = \{D_i^n(j) : j = 1, \dots, n\} \qquad (i = 1, 2, 3). \qquad (3)$$

The predicted probabilities of the outcomes depend on the true state $\rho$ of the system
and they are given by

$$\mathrm{Prob}(D_i^n(j) = 1) = \mathrm{Tr}(\rho P_i^{+}) = \frac{1}{2}\,(1 + \langle \sigma_i \rangle_\rho) = \frac{1}{2}\,(1 + s_i). \qquad (4)$$
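
The measurement scheme can be mimicked numerically by drawing each outcome $+1$ with probability $\frac{1}{2}(1 + s_i)$. The following Python sketch is an illustration under that assumption only; the helper names `outcome_probability` and `simulate_measurements`, the seed and the example state are hypothetical choices.

```python
import random

def outcome_probability(s_i):
    """Probability of the outcome +1 when sigma_i is measured: (1 + s_i)/2."""
    return 0.5 * (1.0 + s_i)

def simulate_measurements(s, n, rng):
    """Return three strings D_1, D_2, D_3, each holding n outcomes +1 or -1."""
    data = []
    for s_i in s:
        p_plus = outcome_probability(s_i)
        data.append([1 if rng.random() < p_plus else -1 for _ in range(n)])
    return data

rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
true_s = [0.3, -0.4, 0.3]
D = simulate_measurements(true_s, n=1000, rng=rng)
# The mean of each string is the relative frequency difference and should be near s_i.
relative_means = [sum(d) / len(d) for d in D]
```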

4 Quality of the estimates



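As a measure of distance between two states of a system, i.e. between two density
operators $\rho$ and $\omega$, the fidelity

$$F(\rho, \omega) = \mathrm{Tr}\sqrt{\rho^{1/2}\,\omega\,\rho^{1/2}} \qquad (5)$$

can be considered [13, ?]. It fulfills the properties

$$F(\rho, \omega) = F(\omega, \rho), \qquad 0 \le F(\rho, \omega) \le 1,$$
$$F(\rho, \omega) = 1 \iff \rho = \omega, \qquad F(\rho, \omega) = 0 \iff \omega \perp \rho .$$

For spin 1/2 systems the fidelity can be calculated from the eigenvalues $\lambda_1$ and
$\lambda_2$ of the operator $A = \rho^{1/2}\omega\rho^{1/2}$ as

$$F(\rho, \omega) = \sqrt{\lambda_1} + \sqrt{\lambda_2}.$$

These eigenvalues can be computed from $\mathrm{Tr}\,A$ and $\mathrm{Det}\,A$ as

$$\lambda_{1,2} = \frac{1}{2}\,\mathrm{Tr}\,A \pm \sqrt{\frac{1}{4}\,(\mathrm{Tr}\,A)^2 - \mathrm{Det}\,A}.$$

If we express $\mathrm{Tr}\,A$ and $\mathrm{Det}\,A$ in terms of the Bloch vectors $s$
(resp. $r$) of $\rho$ (resp. $\omega$), that is,
$\mathrm{Tr}\,A = \frac{1}{2}(1 + r \cdot s)$ and
$\mathrm{Det}\,A = \frac{1}{16}(1 - \|r\|^2)(1 - \|s\|^2)$, the fidelity can be written as

$$F(\rho, \omega) = \frac{1}{2}\left(\sqrt{1 + r \cdot s + T} + \sqrt{1 + r \cdot s - T}\right), \qquad (6)$$

where

$$T = \sqrt{\|r + s\|^2 + (r \cdot s)^2 - \|r\|^2\,\|s\|^2}.$$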
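
The Bloch-vector form of the fidelity is easy to evaluate numerically. The sketch below is a hedged illustration: it assumes $T^2 = \|r + s\|^2 + (r \cdot s)^2 - \|r\|^2\|s\|^2$, which is what the trace and determinant of $A$ give for a qubit, and it takes the sum of the two square roots of the eigenvalues. The function name `fidelity` is hypothetical.

```python
import math

def fidelity(s, r):
    """Fidelity of two qubit states given by their Bloch vectors s and r."""
    dot = sum(a * b for a, b in zip(s, r))
    s2 = sum(a * a for a in s)
    r2 = sum(b * b for b in r)
    # T^2 = (1 + r.s)^2 - (1 - ||r||^2)(1 - ||s||^2)
    #     = ||r + s||^2 + (r.s)^2 - ||r||^2 ||s||^2
    t_sq = (1 + dot) ** 2 - (1 - r2) * (1 - s2)
    t = math.sqrt(max(t_sq, 0.0))  # clamp tiny negative rounding errors
    plus = max(1 + dot + t, 0.0)
    minus = max(1 + dot - t, 0.0)
    return 0.5 * (math.sqrt(plus) + math.sqrt(minus))

f_same = fidelity([0.0, 0.0, 1.0], [0.0, 0.0, 1.0])         # identical pure states
f_orthogonal = fidelity([0.0, 0.0, 1.0], [0.0, 0.0, -1.0])  # orthogonal pure states
f_mixed = fidelity([0.3, -0.4, 0.3], [0.1, 0.2, -0.5])
```

A useful cross-check is the equivalent closed form $F^2 = \frac{1}{2}\left(1 + r \cdot s + \sqrt{(1 - \|r\|^2)(1 - \|s\|^2)}\right)$, which follows by squaring the expression above.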



The quality of the estimation scheme for a true state $\rho$ can be quantified by the
average fidelity between the true state and the estimates $\omega_i$ ($1 \le i \le m$):

$$\Phi(\rho, m) := \frac{1}{m} \sum_{i=1}^{m} F(\rho, \omega_i),$$

if $m$ estimates are available.

Alternatively, the Hilbert-Schmidt distance

$$d(\rho, \omega) := \sqrt{\mathrm{Tr}\,(\rho - \omega)^2} \qquad (7)$$

can be used as a measure. In terms of the Bloch vectors $s$ and $r$ of $\rho$ and $\omega$,
this reduces to $\|s - r\| / \sqrt{2}$. The average Hilbert-Schmidt distance is given by

$$X(\rho, m) := \frac{1}{m} \sum_{i=1}^{m} d(\rho, \omega_i).$$

Remember that for an efficient estimation scheme $X(\rho, m)$ must be small, while
$\Phi(\rho, m)$ should be close to 1.
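
The Hilbert-Schmidt distance and the averaged indicators can be sketched as follows. This is an illustration only: the names `hs_distance` and `average` are hypothetical, the listed estimates are made-up numbers, and the Bloch-vector form $\|s - r\|/\sqrt{2}$ is obtained from $\rho - \omega = \frac{1}{2}\sum_i (s_i - r_i)\sigma_i$.

```python
import math

def hs_distance(s, r):
    """Hilbert-Schmidt distance sqrt(Tr (rho - omega)^2) from Bloch vectors s and r."""
    # Tr (rho - omega)^2 = ||s - r||^2 / 2 for qubit states
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, r)) / 2.0)

def average(values):
    """Arithmetic mean, usable for both averaged indicators."""
    values = list(values)
    return sum(values) / len(values)

true_s = [0.3, -0.4, 0.3]
estimates = [[0.28, -0.41, 0.33], [0.35, -0.38, 0.27]]  # made-up estimates
avg_hs = average(hs_distance(true_s, r) for r in estimates)
```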

5 Bayesian state estimation



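First we give a brief summary of the Bayesian state estimation. In the Bayesian
parameter estimation, the parameters $\theta$ to be estimated are considered as random
variables. The probability $P(\theta \mid D^n)$ of a specific value of the parameters
conditioned on the measured data $D^n$ is evaluated. Afterwards, the mean value of this
distribution can be used as the estimate.

If the measured data is a sequence of outcomes, as in our case, it can be split into the
latest outcome $D^n(n)$ and the preceding outcomes $D^{n-1}$. Then the conditional
distribution of the parameter becomes

$$P(\theta \mid D^n) = P(\theta \mid D^n(n), D^{n-1})$$

and the Bayes formula

$$P(a \mid b, c) = \frac{P(b \mid a, c)\,P(a \mid c)}{\int P(b \mid \nu, c)\,P(\nu \mid c)\,d\nu}$$

can be applied, resulting in the following recursive formula for $P(\theta \mid D^n)$:

$$P(\theta \mid D^n) = \frac{P(D^n(n) \mid \theta, D^{n-1})\,P(\theta \mid D^{n-1})}{\int P(D^n(n) \mid \nu, D^{n-1})\,P(\nu \mid D^{n-1})\,d\nu}.$$
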
In our state estimation, we have three data sets $D_i^n$, $i = 1, 2, 3$, corresponding to
the three directions, see (3). The estimation is performed for the three directions
independently (and afterwards a conditioning has to be made).

The probabilities $P(D_i^n(n) \mid D_i^{n-1}, s_i)$ have the form

$$P(D_i^n(n) \mid D_i^{n-1}, s_i) = P(\pm 1 \mid s_i) = \frac{1}{2}\,\mathrm{Tr}\,\rho\,(I \pm \sigma_i) = \frac{1}{2}\,(1 \pm s_i).$$

If we denote by $\ell(i)$ the number of $+1$'s in the data string $D_i^n$, then the
recursive formula becomes

$$P(s_i \mid D_i^n)(t_i) = \frac{\left(\frac{1}{2}(1 + t_i)\right)^{\ell(i)} \left(\frac{1}{2}(1 - t_i)\right)^{n - \ell(i)} P^0(t_i)}{\int \left(\frac{1}{2}(1 + t_i)\right)^{\ell(i)} \left(\frac{1}{2}(1 - t_i)\right)^{n - \ell(i)} P^0(t_i)\,dt_i},$$

where $P^0(t_i)$ is an assumed prior distribution, from which the recursive estimation
is started. For the sake of simplicity we assume that $P^0(t_i)$ has a similar form with
parameters $\kappa$ and $\lambda$ in place of $n$ and $\ell$, respectively. (These
parameters might depend on $i$, but we neglect this possibility.)



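After a parameter transformation we have a beta distribution,

$$P(s_i \mid D_i^n)(u) = C\,u^{\ell(i) + \lambda}\,(1 - u)^{n - \ell(i) + \kappa - \lambda}, \qquad (10)$$

where $C$ is the normalization constant and $u \in [0, 1]$. It is well-known that the mean
value of this distribution is

$$\frac{\ell(i) + 1 + \lambda}{n + \kappa + 2}, \qquad (11)$$

and the variance is

$$\frac{(\ell(i) + 1 + \lambda)(n - \ell(i) + 1 + \kappa - \lambda)}{(n + \kappa + 2)^2\,(n + \kappa + 3)}. \qquad (12)$$
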
The above statistic (11) can be used to construct an unbiased estimate for $s_i$ in the
form

$$\hat{s}_i = 2\,\frac{\ell(i) + 1 + \lambda}{n + \kappa + 2} - 1 \qquad (13)$$

after the re-transformation of the variables.
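
The posterior mean, the posterior variance and the resulting estimate of a single Bloch vector component can be computed directly from the counts. The Python sketch below is a minimal illustration assuming the beta posterior with shape parameters $\ell + \lambda + 1$ and $n - \ell + \kappa - \lambda + 1$; the function names and the default flat prior ($\kappa = \lambda = 0$) are assumptions made for the example.

```python
def posterior_mean_var(ell, n, kappa=0.0, lam=0.0):
    """Mean and variance of the beta posterior in the variable u in [0, 1]."""
    a = ell + lam + 1.0              # first shape parameter
    b = n - ell + kappa - lam + 1.0  # second shape parameter
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var

def bayes_estimate(outcomes, kappa=0.0, lam=0.0):
    """Unconditioned estimate of s_i: twice the posterior mean minus one."""
    n = len(outcomes)
    ell = sum(1 for x in outcomes if x == 1)  # number of +1 outcomes
    mean, _ = posterior_mean_var(ell, n, kappa, lam)
    return 2.0 * mean - 1.0

outcomes = [1] * 7 + [-1] * 3  # seven +1's among ten outcomes
s_hat = bayes_estimate(outcomes)
mean, var = posterior_mean_var(7, 10)
```

For this example the posterior mean is 8/12, so the estimate is 1/3, slightly closer to zero than the relative frequency difference 0.4 because of the flat prior.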

Since the components of the Bloch vector are estimated independently, the constraint
(1) has not been taken into account yet. Thus, a further step of conditioning is
necessary. We simply condition $(s_1, s_2, s_3)$ to (1):

$$m_i = \frac{\iiint u_i\,f(u_1)\,f(u_2)\,f(u_3)\,du_1\,du_2\,du_3}{\iiint f(u_1)\,f(u_2)\,f(u_3)\,du_1\,du_2\,du_3}, \qquad (14)$$

where both integrals are over the domain $\{(u_1, u_2, u_3) : u_1^2 + u_2^2 + u_3^2 \le 1\}$ and

$$f(u_i) := P(s_i \mid D_i^n)(u_i).$$

Then the conditioned estimate of $s_i$ will be

$$2 m_i - 1 .$$

The justification of the proposed conditioning procedure is the subject of another 
publication. 
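
The conditioning step can be approximated numerically on a grid. The sketch below is only one possible reading of the procedure: it assumes a flat prior, and it interprets the integration domain as the Bloch ball written in the transformed variables, i.e. it keeps the grid points with $\|2u - 1\| \le 1$, since the estimate is mapped back by $s_i = 2m_i - 1$. The function names, the midpoint rule and the grid size are hypothetical choices.

```python
import math

def log_beta_kernel(u, ell, n):
    """Logarithm of the unnormalized posterior u^ell (1 - u)^(n - ell), flat prior."""
    return ell * math.log(u) + (n - ell) * math.log(1.0 - u)

def conditioned_estimate(ells, n, grid=40):
    """Midpoint-rule approximation of the conditioned means, returned as 2 m_i - 1."""
    us = [(k + 0.5) / grid for k in range(grid)]  # midpoints inside (0, 1)
    logf = [[log_beta_kernel(u, ell, n) for u in us] for ell in ells]
    shift = sum(max(row) for row in logf)         # keeps exp() away from underflow
    num = [0.0, 0.0, 0.0]
    den = 0.0
    for i1, u1 in enumerate(us):
        for i2, u2 in enumerate(us):
            for i3, u3 in enumerate(us):
                # keep only points whose Bloch vector 2u - 1 lies in the unit ball
                if (2 * u1 - 1) ** 2 + (2 * u2 - 1) ** 2 + (2 * u3 - 1) ** 2 > 1.0:
                    continue
                w = math.exp(logf[0][i1] + logf[1][i2] + logf[2][i3] - shift)
                den += w
                num[0] += u1 * w
                num[1] += u2 * w
                num[2] += u3 * w
    return [2.0 * x / den - 1.0 for x in num]

balanced = conditioned_estimate([10, 10, 10], n=20)  # equal counts of +1 and -1
extreme = conditioned_estimate([20, 20, 20], n=20)   # only +1 outcomes observed
```

Because the result is a weighted average of points inside the ball, the conditioned estimate always has length at most one, whereas the unconditioned estimate for the `extreme` data would not.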



6 Least squares state estimation 

We have the data set (3) to start with. If $\pi_i(\pm)$ is the relative frequency of
$\pm 1$ in the string $D_i^n$, then the difference

$$\pi_i := \pi_i(+) - \pi_i(-)$$

is an estimate of the $i$th spin component $s_i$ ($i = 1, 2, 3$). As a measure of unfit
(estimation error) we use the Hilbert-Schmidt norm of the difference between the empirical
and the predicted data according to the least squares (LS) principle. (Note that in this
case the Hilbert-Schmidt norm is simply the Euclidean distance in the 3-space.) Then
the following loss function is defined:

$$L(\omega) = d^2(s, \pi) = \sum_{i=1}^{3} (s_i - \pi_i)^2 = \|s\|^2 + \|\pi\|^2 - 2\,s \cdot \pi, \qquad (15)$$

where $s$ is the Bloch vector of the density operator $\omega$.

An estimate of the unknown parameters $s = [s_1, s_2, s_3]^T$ is obtained by solving the
constrained quadratic optimization problem:

$$\text{Minimize } L(\omega) \qquad (16)$$

$$\text{subject to } \|s\| \le 1. \qquad (17)$$

The above loss function is rather simple and we can solve the constrained minimization
problem explicitly. The unconstrained minimum is at $\pi$, and two cases are possible.
First, if $\|\pi\| \le 1$, the constrained minimum is taken at $s = \pi$. Second, when the
unconstrained minimum is at $\pi$ with $\|\pi\| > 1$, then it is clear from the 3-dimensional
geometry that the constrained minimum is taken at

$$s = \frac{\pi}{\|\pi\|}. \qquad (18)$$
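
The explicit solution of the constrained least squares problem amounts to a projection onto the unit ball. A minimal Python sketch, assuming the data are given as three lists of $\pm 1$ outcomes (the function name `ls_estimate` and the example data are hypothetical):

```python
import math

def ls_estimate(data):
    """Least squares estimate: pi itself if ||pi|| <= 1, else pi rescaled to length 1."""
    # The mean of +1/-1 outcomes equals the relative frequency difference pi_i.
    pi = [sum(d) / len(d) for d in data]
    norm = math.sqrt(sum(x * x for x in pi))
    if norm <= 1.0:
        return pi
    return [x / norm for x in pi]

inside = ls_estimate([[1, -1], [1, -1], [1, 1]])                        # pi = [0, 0, 1]
outside = ls_estimate([[1, 1, 1, 1], [1, 1, 1, -1], [1, 1, -1, -1]])    # pi = [1, 0.5, 0]
```

In the second call the raw frequency vector has length greater than one, so the estimate is its radial projection onto the surface of the Bloch ball.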



7 Simulation experiments 

The aim of the experiments is to compare the properties of the above described least 
squares and Bayesian qubit state estimation methods. 

The base data of the estimation is obtained by measuring the spin components $\sigma_1$,
$\sigma_2$ and $\sigma_3$ of several qubits being in the same state, i.e. having just the
same Bloch vector $s$. The number of the measurements in each direction is denoted by $n$
in what follows. The same measurement data were used for the two methods. The Bayesian
method was applied with conditioning and also without it to analyze its effect.

The measurements were performed on a quantum simulator for two level systems
implemented in MATLAB [?]. An experiment setup consisted of a Bloch vector $s$ to
be estimated and a number of spin measurements performed on the quantum system.
The internal random number generator of MATLAB was used to generate "measured
values" according to the probability distribution of the measured outcomes. In this
way a realization of the random measured data set is obtained each time we run the
simulator. Each experiment setup was used five times and the performance indicator
quantities, the fidelity, the Hilbert-Schmidt norm of the estimation error and the empirical
variance of the estimate, were averaged.
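
The structure of one experiment setup can be sketched in Python as follows. This is not the MATLAB simulator used for the results; it is a compact illustration under stated assumptions: a flat prior for the unconditioned Bayesian estimate, the closed form $F^2 = \frac{1}{2}(1 + r \cdot s + \sqrt{(1 - \|r\|^2)(1 - \|s\|^2)})$ for the qubit fidelity, and hypothetical function names, seed and example state.

```python
import math
import random

def simulate(s, n, rng):
    """Three strings of n outcomes +1/-1 with Prob(+1) = (1 + s_i)/2."""
    return [[1 if rng.random() < 0.5 * (1 + si) else -1 for _ in range(n)] for si in s]

def ls_estimate(data):
    """Relative frequency vector, rescaled to unit length if it leaves the Bloch ball."""
    pi = [sum(d) / len(d) for d in data]
    norm = math.sqrt(sum(x * x for x in pi))
    return pi if norm <= 1.0 else [x / norm for x in pi]

def bayes_estimate(data):
    """Unconditioned Bayesian estimate with a flat prior (kappa = lambda = 0)."""
    return [2.0 * (d.count(1) + 1) / (len(d) + 2) - 1.0 for d in data]

def fidelity(s, r):
    """Qubit fidelity; the clamps guard against estimates outside the Bloch ball."""
    dot = sum(a * b for a, b in zip(s, r))
    s2 = sum(a * a for a in s)
    r2 = sum(b * b for b in r)
    root = math.sqrt(max((1 - s2) * (1 - r2), 0.0))
    return math.sqrt(max(0.5 * (1 + dot + root), 0.0))

def hs_distance(s, r):
    """Hilbert-Schmidt distance in terms of Bloch vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, r)) / 2.0)

def run_experiment(s, n, runs=5, seed=0):
    """Average the indicators over `runs` realizations of the measured data."""
    rng = random.Random(seed)
    totals = {"ls_fid": 0.0, "bayes_fid": 0.0, "ls_hs": 0.0, "bayes_hs": 0.0}
    for _ in range(runs):
        data = simulate(s, n, rng)  # the same data are used for both methods
        ls, bayes = ls_estimate(data), bayes_estimate(data)
        totals["ls_fid"] += fidelity(s, ls)
        totals["bayes_fid"] += fidelity(s, bayes)
        totals["ls_hs"] += hs_distance(s, ls)
        totals["bayes_hs"] += hs_distance(s, bayes)
    return {key: value / runs for key, value in totals.items()}

result = run_experiment([0.3, -0.4, 0.3], n=100)
```

Five runs per setup follow the description above; with so few repetitions the averages still fluctuate noticeably between seeds.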

8 Results of the experiments 

The fidelity (6) of the real Bloch vector and the estimated one, the variance (12) of the
estimates and the Hilbert-Schmidt norm (7) of the estimation error were the quantities
which have been used to indicate the performance of the methods.

8.1 Number of measurements 

The first set of experiments was to investigate the dependence between the performance
indicator quantities and the number of measurements $n$.

Fidelity. It was expected that the fidelity goes to 1 when $n$ goes to infinity. Fig.
1 shows the experimental results for estimating a pure state $s_{pure} = [0.5774, 0.5774,
0.5774]^T$. The result of the Bayesian estimation (dotted line) shows the weakest
performance because of the conditioning feature of the method: the conditioned joint
probability density function gives a worse estimate than the original one (dashed line).
On the other hand, the original Bayesian method without conditioning tends to give
defective Bloch vector estimates with length greater than one. The price of the validity of
the Bayesian method with conditioning is the precision for (near) pure states. It is
apparent that the least squares estimation does not have the above problem.

Figure 1: Fidelity as a function of $n$ for a pure state ($s_{pure}$) and a mixed state ($s_{mixed}$)

The situation is a little bit different for estimating mixed states ($s_{mixed} = [0.3, -0.4,
0.3]^T$). It can be seen that the two kinds of Bayesian estimation differ only for small $n$.
When $n$ is greater than 25, the conditioning has no traceable effect, i.e. the Bayesian
estimation with and without conditioning gives the same result. The least squares method
also works a little bit better for mixed states than for pure states, at least for larger
$n$. It can be seen that pure states are a challenge for both methods but least squares
handles this difficulty a bit better.

In order to investigate more deeply the behavior of the estimates with a low number
of measurements we show the variation of the fidelities as a function of the number of
measurements in the interval $n \in [5, 150]$ for both the pure and mixed states above (see
Fig. 2). It was expected that the Bayesian estimates outperform the LS one for a low
number of measurements, but this is only true in the case of mixed states. For pure states
the overly conservative conditioning of the Bayes method causes a bias. In addition,
one can notice that the effects related to the low number of measurements can be seen
only when $n < 25$.

Figure 2: Fidelity as a function of low $n$ for a pure state ($s_{pure}$) and a mixed state ($s_{mixed}$)

Hilbert-Schmidt norm. The Hilbert-Schmidt norm was expected to decrease
to zero in the limit. The experiments seem to come up to expectations (Fig. 3). In
the case of pure states the same phenomenon is noticeable as for fidelity. If one zooms
in on the low number of measurements region in Fig. 3 then the picture in Fig. 4 results.
Here we can see the same effects as for the fidelity, but in a less pronounced way. Thus
fidelity seems to be a more sensitive indicator of performance than the Hilbert-Schmidt
norm.

Figure 3: The Hilbert-Schmidt norm as a function of $n$ for a pure state ($s_{pure}$) and a
mixed state ($s_{mixed}$)

Figure 4: The Hilbert-Schmidt norm as a function of low $n$ for a pure state ($s_{pure}$) and
a mixed state ($s_{mixed}$)

Variance. The variance of the estimates was computed for the Bayesian estimation
before conditioning. As expected, there is no apparent difference between the
variances of the three spin components $s_1$, $s_2$ and $s_3$, and the variance decreases
with $n$. Whether the state to be estimated is pure or mixed also does not
have any effect on the result (Fig. 5). The same effect can be seen if one focuses on
the low number of measurements region, as seen in Fig. 6.

8.2 The length of the Bloch vector 

During the second set of experiments the length of the Bloch vector was varied. Its
direction was $s = [0.5774, 0.5774, 0.5774]^T$. The fidelity was expected to be
relatively independent of the Bloch vector length $\|s\|$. The experimental results can be
seen in Fig. 7. The first picture shows the case $n = 100$, where, in spite of the big
variance, the conditioned Bayesian shows an increase near the pure state ($\|s\| = 1$).
At $n = 900$ it is more apparent that the LS and conditioned Bayesian methods (both have a
certain conditioning feature to avoid faulty estimates near $\|s\| = 1$, see (14), (18))
have worse performance near pure states. Fig. 8 shows the fidelity between $\|s\| = 0.9$ and
$\|s\| = 1$ for $n = 900$, where the above mentioned phenomena can be seen more clearly.

Figure 5: Variance as a function of $n$ for a pure state ($s_{pure}$) and a mixed state ($s_{mixed}$)

Figure 6: Variance as a function of low $n$ for a pure state ($s_{pure}$) and a mixed state
($s_{mixed}$)

Figure 7: Fidelity as a function of $\|s\|$ for $n = 100$ and $n = 900$



As expected, the Hilbert-Schmidt norm seems to be constant for varying
Bloch vector lengths. Fig. 9 shows the simulation results. For relatively small $n$ the
variance is rather big, but on increasing the number of measurements it can be seen that
the Hilbert-Schmidt norm is almost constant. Near $\|s\| = 1$ there is a small increase
for the conditioned Bayesian method.

The variance was expected to be independent of the Bloch vector length. Fig.
10 shows the results with the same variance scale as in Fig. 5. The first graph shows the
results for $n = 100$ measurements, the other one for $n = 900$. The results are in
accordance with Fig. 5. As expected, the two graphs can be regarded as constants.

Figure 8: Fidelity as a function of $\|s\|$ for $n = 900$

Figure 9: Hilbert-Schmidt norm as a function of $\|s\|$ for $n = 100$ and $n = 900$

Figure 10: Variance as a function of $\|s\|$ for $n = 100$ and $n = 900$


9 Conclusion 



The performance of two state estimation methods, the Bayesian state estimation as a
statistical method and the least squares (LS) method as an optimization-based method,
is investigated in this paper by using simulation experiments. The fidelity and the
Hilbert-Schmidt norm of the estimation error as well as the empirical variance of the
estimate are used as performance indicator quantities. The variation of these quantities
as functions of the number of measurements and the length of the Bloch vector is
computed.

It is found that fidelity is the best indicator of the quality of an estimate among the
three investigated performance indicator quantities from both a qualitative and a
quantitative point of view. For state estimation of a single qubit the regions of the 'low
measurement number', $n < 25$, and of the 'large measurement number', $n > 200$,
have been determined experimentally. As for the comparison of the different state
estimation methods we have found that the Bayesian method could outperform the LS
estimation only in the case of mixed states for a low number of measurements (below
$n = 25$).

The investigated methods were found to be quite sensitive to the length of the Bloch
vector, i.e. to whether a pure or a mixed state was the one to be estimated. The
methods that are not informed about the purity of the state can perform quite badly
if they are used to estimate a pure state or a "nearly pure" state.

It is also found that the way of conditioning is critical for the methods capable of
estimating both pure and mixed states. The simple length constraint of the least squares
method (in (18)) seems to work quite effectively, thus a version of the Bayesian
estimation method with LS-type constraining is a good candidate for an improved stochastic
state estimation method.

To handle the difficulties related to estimating nearly pure states one should
avoid using a flat geometry on the state space, and one should probably use a suitably
defined special Riemannian geometry instead.